A Simple Correlation-Based Model of Intelligibility for Nonlinear Speech Enhancement and Separation
Applying a binary mask to a pure noise signal can result in speech that is highly intelligible, despite the absence of any of the target speech signal. Therefore, to estimate the intelligibility benefit of highly nonlinear speech enhancement techniques, we contend that SNR is not useful; instead we propose a measure based on the similarity between the time-varying spectral envelopes of target speech and system output, as measured by correlation. As with previous correlation-based intelligibility measures, our system can broadly match subjective intelligibility for a range of enhanced signals. Our system, however, is notably simpler and we explain the practical motivation behind each stage. This measure, freely available as a small Matlab implementation, can provide a more meaningful evaluation measure for nonlinear speech enhancement systems, as well as providing a transparent objective function for the optimization of such systems.
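The idea of correlating time-varying spectral envelopes can be illustrated with a minimal Python sketch. It is not the authors' Matlab implementation: the function names, the frame length, the hop size, and the equal-width band layout are all assumptions chosen for brevity.

```python
import numpy as np

def spectral_envelopes(x, frame_len=256, hop=128, n_bands=8):
    """Return an (n_bands, n_frames) array of band energies over time.

    Assumed layout: Hann-windowed frames, FFT power spectrum,
    equal-width groups of FFT bins (a real system might use
    auditory-motivated bands instead).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_bins)
    bands = np.array_split(power, n_bands, axis=1)
    return np.stack([b.sum(axis=1) for b in bands])        # (n_bands, n_frames)

def envelope_correlation(target, output, **kwargs):
    """Mean over bands of the Pearson correlation between envelopes."""
    e_t = spectral_envelopes(target, **kwargs)
    e_o = spectral_envelopes(output, **kwargs)
    # Remove the per-band mean so the result is a correlation coefficient.
    e_t = e_t - e_t.mean(axis=1, keepdims=True)
    e_o = e_o - e_o.mean(axis=1, keepdims=True)
    num = (e_t * e_o).sum(axis=1)
    den = np.sqrt((e_t ** 2).sum(axis=1) * (e_o ** 2).sum(axis=1)) + 1e-12
    return float(np.mean(num / den))
```

Calling `envelope_correlation(target, output)` on two equal-length signals gives a score close to 1 when the envelopes match and close to 0 when they are unrelated. Unlike SNR, the score depends only on the envelopes, not on the waveform.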
Model-based Speech Enhancement for Intelligibility Improvement in Binaural Hearing Aids
Speech intelligibility is often severely degraded among hearing impaired
individuals in situations such as the cocktail party scenario. The performance
of the current hearing aid technology has been observed to be limited in these
scenarios. In this paper, we propose a binaural speech enhancement framework
that takes into consideration the speech production model. The enhancement
framework proposed here is based on the Kalman filter that allows us to take
the speech production dynamics into account during the enhancement process. The
usage of a Kalman filter requires the estimation of clean speech and noise
short term predictor (STP) parameters, and the clean speech pitch parameters.
In this work, a binaural codebook-based method is proposed for estimating the
STP parameters, and a directional pitch estimator based on the harmonic model
and maximum likelihood principle is used to estimate the pitch parameters. The
proposed method for estimating the STP and pitch parameters jointly uses the
information from left and right ears, leading to a more robust estimation of
the filter parameters. Objective measures such as PESQ and STOI have been used
to evaluate the enhancement framework in different acoustic scenarios
representative of the cocktail party scenario. We have also conducted
subjective listening tests on a set of nine normal hearing subjects, to
evaluate the performance in terms of intelligibility and quality improvement.
The listening tests show that the proposed algorithm, even with access to only
a single channel noisy observation, significantly improves the overall speech
quality, and the speech intelligibility by up to 15%.
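As a rough illustration of how a Kalman filter can exploit a speech production model, the sketch below denoises a single channel whose clean signal follows an autoregressive (short-term predictor) model. It is a simplified assumption-laden example, not the paper's method: the coefficients and variances are taken as known, whereas the paper estimates them with a binaural codebook, and the pitch model is omitted. The function name `kalman_ar_denoise` and its parameters are hypothetical.

```python
import numpy as np

def kalman_ar_denoise(y, a, var_excitation, var_noise):
    """Kalman filter for s[k] = sum_i a[i] * s[k-1-i] + e[k], y[k] = s[k] + v[k].

    y              : noisy observations (1-D array)
    a              : AR coefficients, assumed known here
    var_excitation : variance of the excitation e[k]
    var_noise      : variance of the white observation noise v[k]
    """
    p = len(a)
    A = np.zeros((p, p))
    A[0, :] = a                     # first row holds the AR coefficients
    A[1:, :-1] = np.eye(p - 1)      # shift register for past samples
    H = np.zeros(p)
    H[0] = 1.0                      # only the current sample is observed
    Q = np.zeros((p, p))
    Q[0, 0] = var_excitation

    x = np.zeros(p)                 # state: [s[k], s[k-1], ..., s[k-p+1]]
    P = np.eye(p)                   # state error covariance
    out = np.empty(len(y))
    for k, yk in enumerate(y):
        # Predict step.
        x = A @ x
        P = A @ P @ A.T + Q
        # Update step (scalar observation, so the innovation variance is scalar).
        S = H @ P @ H + var_noise
        K = P @ H / S
        x = x + K * (yk - H @ x)
        P = P - np.outer(K, H @ P)
        out[k] = x[0]
    return out
```

In practice the AR coefficients and variances would be re-estimated frame by frame, since speech is only short-term stationary.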
The He-rich core-collapse supernova 2007Y: Observations from X-ray to Radio Wavelengths
A detailed study spanning approximately a year has been conducted on the Type
Ib supernova 2007Y. Imaging was obtained from X-ray to radio wavelengths, and a
comprehensive set of multi-band (w2m2w1u'g'r'i'UBVYJHKs) light curves and
optical spectroscopy is presented. A virtually complete bolometric light curve
is derived, from which we infer a (56)Ni-mass of 0.06 M_sun. The early spectrum
strongly resembles SN 2005bf and exhibits high-velocity features of CaII and
H_alpha; during late epochs the spectrum shows evidence of an ejecta-wind
interaction. Nebular emission lines have similar widths and exhibit profiles
that indicate a lack of major asymmetry in the ejecta. Late phase spectra are
modeled with a non-LTE code, from which we find (56)Ni, O and total-ejecta
masses (excluding He) to be 0.06, 0.2 and 0.42 M_sun, respectively, below 4,500
km/s. The (56)Ni mass confirms results obtained from the bolometric light
curve. The oxygen abundance suggests the progenitor was most likely a ~3.3
M_sun He core star that evolved from a zero-age-main-sequence mass of 10-13
M_sun. The explosion energy is determined to be ~10^50 erg, and the mass-loss
rate of the progenitor is constrained from X-ray and radio observations to be
<~10^-6 M_sun/yr. SN 2007Y is among the least energetic normal Type Ib
supernovae ever studied.
Comment: Corrected error in Tab. 2 & 3. Photometry has not change
An evaluation of objective measures for intelligibility prediction of time-frequency weighted noisy speech
Existing objective speech-intelligibility measures are suitable for several types of degradation; however, they turn out to be less appropriate when noisy speech is processed by a time-frequency weighting. To this end, an extensive evaluation is presented of objective measures for intelligibility prediction of noisy speech processed with a technique called ideal time-frequency (TF) segregation. In total, 17 measures are evaluated, including four advanced speech-intelligibility measures (CSII, CSTI, NSEC, DAU), the advanced speech-quality measure (PESQ), and several frame-based measures (e.g., SSNR). Furthermore, several additional measures are proposed. The study comprised a total of 168 different TF-weightings, including unprocessed noisy speech. Out of all measures, the proposed frame-based measure MCC gave the best results (ρ = 0.93). An additional experiment shows that the well-performing measures in this study also show high correlation with the intelligibility of single-channel noise-reduced speech.